# Video Instruction Understanding

SmolVLM2 2.2B Instruct GGUF

SmolVLM2-2.2B-Instruct is a 2.2B-parameter vision-language model for video-text-to-text tasks. It supports English.

SmolVLM2 256M Video Instruct MLX

This is a video-text-to-text model converted to the MLX format, suitable for video understanding and instruction-following tasks.

Tags: Transformers, English

SmolVLM2 500M Video Instruct MLX

This is a video-text-to-text model in the MLX format, developed by HuggingFaceTB, with English-language support.

Tags: Transformers, English
